A Composite Model for Subgroup Identification and Prediction via Bicluster Analysis

Authors

  • Hung-Chia Chen
  • Wen Zou
  • Tzu-Pin Lu
  • James J. Chen
Abstract

BACKGROUND A major challenge in the analysis of large and complex biomedical data is to develop an approach for 1) identifying distinct subgroups in the sampled populations, 2) characterizing the relationships among subgroups, and 3) developing a prediction model to classify subgroup memberships of new samples by finding a set of predictors. Each subgroup can represent different pathogen serotypes of microorganisms, different tumor subtypes in cancer patients, or different genetic makeups of patients related to treatment response.

METHODS This paper proposes a composite model for subgroup identification and prediction using biclusters. A biclustering technique is first used to identify a set of biclusters from the sampled data. For each bicluster, a subgroup-specific binary classifier is built to determine whether a particular sample is inside or outside the bicluster. A composite model, which consists of all binary classifiers, is constructed to classify samples into several disjoint subgroups. The proposed composite model depends neither on any specific biclustering algorithm or bicluster pattern nor on any particular classification algorithm.

RESULTS The composite model was shown to have an overall accuracy of 97.4% for a synthetic dataset consisting of four subgroups. The model was also applied to two datasets in which the samples' subgroup memberships were known. The procedure showed 83.7% accuracy in discriminating lung cancer adenocarcinoma and squamous carcinoma subtypes, and was able to identify 5 serotypes and several subtypes with about 94% accuracy in a pathogen dataset.

CONCLUSION The composite model presents a novel approach to developing a biclustering-based classification model from unlabeled sampled data. The proposed approach combines unsupervised biclustering and supervised classification techniques to classify samples into disjoint subgroups based on their associated attributes, such as genotypic factors, phenotypic outcomes, efficacy/safety measures, or responses to treatments. The procedure is useful for identification of unknown species or new biomarkers for targeted therapy.
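The workflow described in the abstract (biclusters first, then one in/out binary classifier per bicluster, then a composite decision) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the biclusters are supplied by hand rather than produced by a biclustering algorithm, the binary classifier is a simple nearest-centroid rule restricted to each bicluster's columns, and the composite rule (pick the highest-scoring classifier, or report "unassigned" when no classifier places the sample inside its bicluster) is one plausible way to obtain disjoint subgroups. The names `Bicluster`, `CentroidClassifier`, and `CompositeModel` are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Bicluster:
    name: str
    rows: list  # indices of samples inside the bicluster
    cols: list  # indices of features that define the bicluster


class CentroidClassifier:
    """Binary in/out classifier for one bicluster (illustrative assumption).

    Compares a sample with the mean profile of inside and outside samples,
    using only the bicluster's own columns. Requires at least one sample
    inside and one outside the bicluster.
    """

    def __init__(self, data, bic):
        self.cols = bic.cols
        inside_rows = set(bic.rows)
        inside = [data[i] for i in bic.rows]
        outside = [data[i] for i in range(len(data)) if i not in inside_rows]
        self.c_in = [mean(r[j] for r in inside) for j in self.cols]
        self.c_out = [mean(r[j] for r in outside) for j in self.cols]

    def score(self, sample):
        # Positive when the sample is closer to the inside centroid.
        d_in = sum((sample[j] - c) ** 2 for j, c in zip(self.cols, self.c_in))
        d_out = sum((sample[j] - c) ** 2 for j, c in zip(self.cols, self.c_out))
        return d_out - d_in


class CompositeModel:
    """Collection of per-bicluster binary classifiers with one decision rule."""

    def __init__(self, data, biclusters):
        self.members = [(b.name, CentroidClassifier(data, b)) for b in biclusters]

    def predict(self, sample):
        scores = {name: clf.score(sample) for name, clf in self.members}
        best = max(scores, key=scores.get)
        # Assumed rule: a sample that no classifier places inside stays unassigned.
        return best if scores[best] > 0 else "unassigned"


# Toy data: samples 0-2 are high on features 0-1, samples 3-5 on features 2-3.
data = [
    [5.0, 5.1, 0.1, 0.2],
    [4.9, 5.2, 0.0, 0.1],
    [5.1, 4.8, 0.2, 0.0],
    [0.1, 0.0, 5.0, 5.2],
    [0.2, 0.1, 4.9, 5.1],
    [0.0, 0.2, 5.1, 4.8],
]
# Hand-specified biclusters standing in for the output of a biclustering step.
biclusters = [
    Bicluster("A", rows=[0, 1, 2], cols=[0, 1]),
    Bicluster("B", rows=[3, 4, 5], cols=[2, 3]),
]
model = CompositeModel(data, biclusters)

print(model.predict([5.0, 5.0, 0.1, 0.1]))
print(model.predict([0.1, 0.1, 5.0, 5.0]))
print(model.predict([0.1, 0.1, 0.1, 0.1]))
```

Because the abstract states that the composite model is not tied to a particular biclustering or classification algorithm, either component in this sketch could be swapped for another method without changing the overall structure.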

Journal title:

Volume 9  Issue 

Pages  -

Publication date 2014